Detection of and responses to network attacks

ABSTRACT

Disclosed are various embodiments for detecting and responding to attacks on a computer network. One embodiment of such a method includes monitoring dropped data communications intended for a target class of first virtual machine nodes; determining whether a dropped data communication is a form of attack on a network to which the first virtual machine nodes are connected; and sending a notification message of the determined attack to a data transmission system manager node, thereby causing the data transmission system manager node to generate a list of one or more internet protocol addresses associated with a source of the dropped data communication and send the list of one or more internet protocol addresses to at least one second transmission manager node for second virtual machine nodes that are not part of the target class.

CROSS-REFERENCE TO RELATED APPLICATION

This application is a continuation of copending U.S. utility application entitled, “Detection of and Responses to Network Attacks,” having Ser. No. 13/930,507, filed Jun. 28, 2013, which is a continuation of copending U.S. utility application entitled, “Detection of and Responses to Network Attacks,” having Ser. No. 12/980,057, filed Dec. 28, 2010, U.S. Pat. No. 8,499,348, issued Jul. 30, 2013, both of which are entirely incorporated herein by reference.

BACKGROUND

One problem that arises in the context of data centers that virtually or physically host large numbers of applications or systems for a set of diverse customers involves providing network isolation for the systems operated by or on behalf of each customer, so as to allow communications between those systems (if desired by the customer) while restricting undesired communications to those systems from other systems.

BRIEF DESCRIPTION OF THE DRAWINGS

Many aspects of the present disclosure can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the disclosure. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.

FIG. 1 is a drawing of a Data Transmission Management system according to various embodiments of the present disclosure.

FIG. 2 is a drawing of an example computing system suitable for executing an embodiment of a DTM System Manager component of FIG. 1 according to various embodiments of the present disclosure.

FIG. 3 is a drawing of an example computing system suitable for executing an embodiment of a Network Diagnostic System component of FIG. 1 according to various embodiments of the present disclosure.

FIGS. 4A-4B illustrate examples of using group membership information for managing communications between computing nodes according to various embodiments of the present disclosure.

FIG. 5 is a flowchart illustrating one example of a detection routine implemented as portions of a Network Diagnostic System executed in a computing device in the Data Transmission Management System of FIG. 1 according to various embodiments of the present disclosure.

FIG. 6 is a flowchart illustrating one example of a response routine implemented as portions of a Network Diagnostic System executed in a computing device in the Data Transmission Management System of FIG. 1 according to various embodiments of the present disclosure.

FIGS. 7A-7B illustrate examples of authorizing dynamic changes to be made to a customer's access rights according to various embodiments of the present disclosure.

DETAILED DESCRIPTION

Techniques are described for dynamically updating access policies for computing nodes on a network upon discovering attacks on the network. In some embodiments, the computing nodes include virtual machine nodes that are hosted on one or more physical computing machines or systems, and the communications include transmissions of data (e.g., messages, data packets or frames, etc.) between nodes hosted on the same physical machine or distinct physical machines over one or more networks.

FIG. 1 is a network diagram illustrating an example embodiment in which multiple Transmission Manager (“TM”) components manage communications between computing nodes, with the multiple TM components being part of a Data Transmission Management (“DTM”) system 102 managing the data transmissions of various computing nodes located within a data center 100. In this example, data center 100 comprises a number of racks 105, which each include a number of physical computing systems 110a-c and a rack support computing system 122. The computing systems 110a-c each provide one or more virtual machine nodes 120, which each may be employed to provide an independent computing environment to host applications within the data center 100. In addition, the computing systems 110a-c each host a TM component node 115 that manages outgoing data transmissions from virtual machine nodes 120 hosted on the computing system, as well as incoming data transmissions from other nodes (whether local or remote to the data center 100) to those hosted virtual machine nodes on the computing system. In this example embodiment, the rack support computing system 122 provides utility services for computing systems local to the rack (e.g., data storage services, network proxies, application monitoring and administration, etc.), as well as possibly other computing systems located in the data center, although in other embodiments such rack support computing systems may not be used. The computing systems 110a-c and the rack support computing system 122 of a rack in this example all share a common, high-speed, rack-level network interconnect (e.g., via a shared backplane, one or more hubs and/or switches that are physically local or remote to the particular rack, etc.), not shown.

In addition, in at least some embodiments, the management of data transmissions includes analyzing outgoing data transmissions that are requested or otherwise initiated from a source node to one or more destination nodes in order to determine whether the data transmissions are authorized, such as under control of a TM component 125 associated with the source node, and with the data transmissions being allowed to continue over one or more networks to the destination node(s) only if authorization is determined to exist. The determination of authorization by the TM component 125 may, for example, be based at least in part on defined data transmission policies that specify groups of one or more source nodes that are authorized to communicate with groups of one or more destination nodes, such as when a source node and destination node both belong to a common group of nodes.

In addition, the example data center 100 further comprises additional computing systems 130a-b and 135 that are not located on a rack, but share a common network interconnect to a TM component 125 associated with those additional computing systems, although in other embodiments such additional non-rack computing systems may not be present. In this example, computing system 135 also hosts a number of virtual machine nodes, while computing systems 130a-b instead each act as a single physical machine node. The TM component 125 similarly manages incoming and outgoing data transmissions for the associated virtual machine nodes hosted on computing system 135 and for computing system nodes 130a-b. An optional DTM Group Manager component (not shown) may provide a number of services to TM components local to the data center 100, such as to maintain global state information for the TM components (e.g., group membership information, access policies, etc.).

In some embodiments, an application execution service executes third-party customers' applications using multiple physical machines (e.g., in one or more data centers) that each host multiple virtual machines or nodes 120 (which are each able to execute one or more applications for a customer), and the described techniques may be used by one or more data transmission management systems executing as part of the application execution service to control communications to and from the applications of each customer. Customers may provide applications for execution to the execution service and may reserve execution time and other resources on physical or virtual hardware facilities provided by the execution service. In addition, customers may create new groups of computing nodes (e.g., multiple computing nodes that are currently each executing one of multiple instances of a program of the customer) and specify access policies for the groups. After specifying access policies, customers may have the membership of the groups and/or the specified access policies be updated (whether automatically or manually) to reflect changing conditions, such as detection of possible network vulnerabilities, new application instances that are executed, previously executing application instances that are no longer executing, and/or new or adjusted access policies (e.g., to reflect new security requirements, such as by changing whether access to other computing nodes, groups and/or applications is allowed or denied, possibly in response to an attack on a network or computing nodes).

In some embodiments, access policies describe source nodes (also referred to as “sending nodes” or “senders”) that are allowed to transmit data to a particular destination node or group of nodes, such as by describing such source nodes individually (e.g., via network address or other identifier), via ranges of network addresses or other identifiers, as one or more groups of related source nodes, etc., while in other embodiments access policies may instead, in a similar manner, describe destination nodes that are allowed to receive data transmissions from one or more particular source nodes or groups of nodes. In the absence of specified access policies and/or the ability to determine that a particular initiated data transmission is authorized, some embodiments may provide default access policies and/or authorization policies, such as to deny all data transmissions unless determined to be authorized, or instead to allow all data transmissions unless determined to not be authorized.
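As a non-authoritative illustration of the group-based checks just described, the following Python sketch shows how such an authorization decision might be made, including a configurable deny-all or allow-all default; the names (GROUPS, ACCESS_POLICIES, is_authorized) are hypothetical and not taken from the disclosure.

    # Hypothetical sketch of group-based transmission authorization.
    GROUPS = {
        "Group1": {"A", "C"},
        "Group2": {"A", "B", "D"},
    }

    # Each policy authorizes members of a source group to send to a destination group.
    ACCESS_POLICIES = [
        {"source_group": "Group1", "dest_group": "Group2"},
    ]

    DEFAULT_ALLOW = False  # deny all transmissions unless determined to be authorized

    def is_authorized(source_node: str, dest_node: str) -> bool:
        """Return True if some policy or a shared group authorizes the transmission."""
        for policy in ACCESS_POLICIES:
            if (source_node in GROUPS.get(policy["source_group"], set())
                    and dest_node in GROUPS.get(policy["dest_group"], set())):
                return True
        # A source and destination belonging to a common group may also communicate.
        if any(source_node in members and dest_node in members
               for members in GROUPS.values()):
            return True
        return DEFAULT_ALLOW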

In one example embodiment, multiple transmission manager (“TM”) components 115, 125 of a Data Transmission Management (“DTM”) system 102 work together to manage the data transmissions of a number of intercommunicating participant computing nodes. Initially, when a participant computing node comes online, a TM component 125 associated with the participant node determines the node's network address (e.g., Internet Protocol (IP) address) or other network location, any groups to which the node belongs, and indications of source nodes that are authorized to transmit data to the node. Later, when the participant node attempts to initiate communication with a remote destination node, the associated TM component 125 detects the initiated communication and determines whether authorization for the communication already exists based on obtained authorization for a prior communication from the participant source node to the destination node. If existing authorization is not available, the associated TM component 125 attempts to negotiate authorization to communicate with the remote destination node, such as by communicating with a remote TM component 115 associated with the remote destination node (e.g., by sending a negotiation request that triggers the negotiation); a negotiation request for a data transmission from a participant source node to a destination node may contain information related to the network identity and group membership of the participant source node.

After the remote TM component 115 associated with the remote destination node receives a negotiation request on behalf of a source node, the component determines whether the source node is authorized to communicate with the remote destination node based on any access and/or transmission policies of the remote destination node (e.g., based on the groups of which the remote destination node is a member). If it is determined that authorization exists, the remote TM component 115 responds to the negotiation request with a reply indicating that authorization to communicate is provided. The TM component 125 associated with the participant source node receives this reply and proceeds to allow data to be transmitted to the remote destination node (whether by transmitting the data on behalf of the participant source node, allowing a data transmission by the participant source node to proceed, etc.). If the reply instead indicates that authorization to communicate has not been obtained, the TM component 125 associated with the participant source node proceeds to prevent the data transmission to the destination node from occurring (whether by dropping or otherwise discarding an intercepted data transmission, by indicating to the participant source node and/or others not to perform any data transmissions to the destination node, etc.). In addition, the TM component 125 associated with the participant source node may cache or otherwise store the result of the negotiation so that future transmissions do not require the additional step of negotiation, and the TM component 115 associated with the destination node may similarly cache or otherwise store the result of the negotiation. In this manner, the DTM system 102 dynamically determines whether the associated computing nodes that it manages are authorized to transmit data to various remote destination nodes.
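The negotiation-and-caching flow described above can be summarized in a short Python sketch; this is a simplified illustration under assumed names (groups_of and send_negotiation_request are stand-in stubs for the group lookup and the TM-to-TM message exchange), not the disclosed implementation.

    def groups_of(node):
        # Stub: look up the groups to which a node belongs.
        return {"Group1"}

    def send_negotiation_request(destination, request):
        # Stub: the remote TM component decides and replies.
        return {"authorized": True}

    negotiation_cache = {}  # (source, destination) -> negotiated result

    def allow_transmission(source: str, destination: str) -> bool:
        key = (source, destination)
        if key in negotiation_cache:  # authorization obtained for a prior communication
            return negotiation_cache[key]
        # The negotiation request carries the source's identity and group membership.
        request = {"source": source, "groups": groups_of(source)}
        reply = send_negotiation_request(destination, request)
        negotiation_cache[key] = reply["authorized"]  # cache so future sends skip negotiation
        return reply["authorized"]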

In accordance with one embodiment, a Network Diagnostic System (NDS) 145 is also illustrated at the interconnect between the data center 100 local network and the external network 170, such as may be employed to provide a number of services (e.g., network proxies, the filtering or other management of incoming and/or outgoing data transmissions, etc.), including to analyze network communications and attempt to discover attempted network intrusions (e.g., attempted use of a computer system that exceeds authentication limits) or attacks from some or all nodes internal to the data center 100 to nodes located in additional data centers 160 or other computing systems 180 external to the data center 100. In some embodiments, a Network Diagnostic System component 146 may be located on a physical machine 110c with one or more virtual machine nodes and/or a Transmission Manager component. Further, in some embodiments, a Network Diagnostic System component 147 may be located within respective virtual machine nodes hosted on a physical machine 110c. The network 170 includes, for example, the Internet, intranets, extranets, wide area networks (WANs), local area networks (LANs), wired networks, wireless networks, or other suitable networks, etc., or any combination of two or more such networks.

For example, computing nodes 120, 130a-b reachable from an external network, such as the Internet, often receive some form of malicious attack or attempt to compromise the security of the node 120, 130a-b. A majority of such attacks are not targeted and are random. With the aforementioned access policies, operators of computing nodes 120, 130a-b can restrict with whom the nodes communicate. However, even with the access policies, a computing node 120, 130a-b may be open to being attacked. For example, a computing node 120, 130a-b may configure its access policies to receive communications from anywhere on the external network 170, since the node 120, 130a-b may be offering services that it is attempting to sell to potential customers that may exist anywhere on the external network 170. Therefore, embodiments of the Network Diagnostic System 145 detect an attempt to compromise the security of computing node(s) 120, 130a-b by a malicious agent and cause an action to be implemented to protect computing node(s) 120, 130a-b which are vulnerable to actions of the malicious agent.

The example data center 100 is connected to a number of other computing systems via a network 170 (e.g., the Internet), including additional computing systems 180 that may be operated by the operator of the data center 100 or third parties, additional data centers 160 that also may be operated by the operator of the data center 100 or third parties, and an optional DTM System Manager system 150. In this example, the DTM System Manager 150 may maintain global state information for TM components in a number of data centers, such as the illustrated data center 100 and additional data centers 160. The information maintained and provided by the DTM System Manager may, for example, include group membership information, access policies, etc. Although the example DTM System Manager 150 is depicted as being external to data center 100 in this example embodiment, in other embodiments it may instead be located within data center 100.

FIG. 2 is a block diagram illustrating an example computing system suitable for managing communications between computing nodes, such as by executing an embodiment of a DTM System Manager component 150. In accordance with one embodiment, the example computing system 200 includes at least one central processing unit (“CPU”) 235, various input/output (“I/O”) devices 205, storage 240, and memory 245, with the I/O devices including a display 210, a network connection 215, a computer-readable media drive 220, and other I/O devices 230. In other embodiments, one or more components, such as display 210, may not be present in the computing system. In the illustrated embodiment, an example DTM System Manager system 150 is executing in memory 245 in order to maintain and provide information related to the operation of one or more TM components 115, 125 (FIG. 1) (such as access policies and group membership), as discussed in greater detail elsewhere.

It is understood that there may be other applications or programs 255 that are stored in the memory 245 and are executable by the central processing unit 235 as can be appreciated. Where any component discussed herein is implemented in the form of software, any one of a number of programming languages may be employed such as, for example, C, C++, C#, Objective-C, Java, JavaScript, Perl, PHP, Visual Basic, Python, Ruby, Delphi, Flash, or other programming languages.

Similarly, FIG. 3 is a block diagram illustrating an example computing system suitable for monitoring network communications between computing nodes, such as by executing an embodiment of a Network Diagnostic System component. The example computing system 300 includes at least one central processing unit (“CPU”) 335, various input/output (“I/O”) devices 305, storage 340, and memory 345, with the I/O devices including a display 310, a network connection 315, a computer-readable media drive 320, and other I/O devices 330. In other embodiments, one or more components, such as display 310, may not be present in the computing system. In the illustrated embodiment, an example Network Diagnostic System 145 is executing in memory 345 in order to maintain and provide information related to a status of computing nodes 120, 130a-b (FIG. 1), as discussed in greater detail elsewhere.

It is understood that there may be other applications or programs 355 that are stored in the memory 345 and are executable by the central processing unit 335 as can be appreciated. Where any component discussed herein is implemented in the form of software, any one of a number of programming languages may be employed such as, for example, C, C++, C#, Objective-C, Java, JavaScript, Perl, PHP, Visual Basic, Python, Ruby, Delphi, Flash, or other programming languages.

It will be appreciated that computing systems 200, 300 are merely illustrative and are not intended to limit the scope of the present disclosure. For example, computing system 200, 300 may be connected to other devices that are not illustrated, including through one or more networks such as the Internet or via the World Wide Web (“Web”). More generally, a “node” or other computing system may comprise any combination of hardware or software that can interact and perform the described types of functionality, including without limitation desktop or other computers, database servers, network storage devices and other network devices, PDAs, cellphones, wireless phones, pagers, electronic organizers, Internet appliances, television-based systems (e.g., using set-top boxes and/or personal/digital video recorders), and various other consumer products that include appropriate inter-communication capabilities. In addition, the functionality provided by the illustrated components and systems may in some embodiments be combined in fewer components or distributed in additional components. Similarly, in some embodiments the functionality of some of the illustrated components may not be provided and/or other additional functionality may be available.

The computing device 200, 300 may comprise, for example, a server computer or any other system providing computing capability. Alternatively, a plurality of computing devices 200, 300 may be employed that are arranged, for example, in one or more server banks or computer banks or other arrangements. For example, a plurality of computing devices 200, 300 together may comprise a cloud computing resource, a grid computing resource, and/or any other distributed computing arrangement. Such computing devices 200, 300 may be located in a single installation or may be distributed among many different geographical locations. For purposes of convenience, the computing device 200, 300 is referred to herein in the singular. Even though the computing device is referred to in the singular, it is understood that a plurality of computing devices may be employed in the various arrangements as described above.

The advent of virtualization technologies for commodity hardware has provided a partial solution to the problem of managing large-scale computing resources for many customers with diverse needs, allowing various computing resources to be efficiently and securely shared between multiple customers. For example, virtualization technologies such as those provided by VMWare, XEN, or User-Mode Linux may allow a single physical computing machine to be shared among multiple users by providing each user with one or more virtual machines hosted by the single physical computing machine. Each such virtual machine may be a software simulation acting as a distinct logical computing system that provides users with the illusion that they are the sole operators and administrators of a given hardware computing resource, while also providing application isolation and security among the various virtual machines. Furthermore, some virtualization technologies are capable of providing virtual resources that span one or more physical resources, such as a single virtual machine with multiple virtual processors that actually spans multiple distinct physical computing systems.

It will also be appreciated that, while various items are illustrated as being stored in memory or on storage while being used, these items or portions of them can be transferred between memory and other storage devices for purposes of memory management and data integrity. Alternatively, in other embodiments some or all of the software components and/or systems may execute in memory on another device and communicate with the illustrated computing system via inter-computer communication. Some or all of the components, systems and data structures may also be stored (e.g., as software instructions or structured data) on a computer-readable medium, such as a hard disk, a memory, a network, or a portable media article to be read by an appropriate drive or via an appropriate connection. Such computer program products may also take other forms in other embodiments. Accordingly, embodiments of the present disclosure may be practiced with other computer system configurations.

The memory 245, 345 is defined herein as including both volatile and nonvolatile memory and data storage components. Volatile components are those that do not retain data values upon loss of power. Nonvolatile components are those that retain data upon a loss of power. Thus, the memory 245, 345 may comprise, for example, random access memory (RAM), read-only memory (ROM), hard disk drives, solid-state drives, USB flash drives, memory cards accessed via a memory card reader, floppy disks accessed via an associated floppy disk drive, optical discs accessed via an optical disc drive, magnetic tapes accessed via an appropriate tape drive, and/or other memory components, or a combination of any two or more of these memory components. In addition, the RAM may comprise, for example, static random access memory (SRAM), dynamic random access memory (DRAM), or magnetic random access memory (MRAM) and other such devices. The ROM may comprise, for example, a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other like memory device.

FIGS. 4A-4B illustrate examples of using group membership information for managing communications between computing nodes. The data illustrated in FIGS. 4A and 4B may be maintained and provided in various manners, such as by the DTM System Manager system 150 shown in FIG. 1 and/or by one or more of various TM components (e.g., in a distributed manner without use of a central system).

FIG. 4A depicts a table 400 that contains membership information for multiple node groups. In particular, each data row 404b-404f describes a membership association between a node denoted in column 402a and a group denoted in column 402b. Thus, for example, rows 404c and 404d indicate that node group Group2 includes at least nodes A and B, and rows 404e and 404f indicate that node D is a member of at least two groups. For illustrative purposes, the nodes in the present example are all indicated by single letters, such as ‘A’, ‘B’, ‘C’, etc., although they could instead be indicated in other ways in other embodiments, such as Internet Protocol (“IP”) addresses, DNS domain names, etc. Similarly, groups are indicated in the present example by strings such as “Group1”, but various other types of names may be used, and in at least some embodiments, users may be able to specify descriptive group names for groups that they use. Column 402c indicates that various types of additional information may be specified and used for groups, such as expiration dates, contact information for the user that created or otherwise manages the group, etc.

FIG. 4B depicts a table 410 that specifies access rights associated with some of the groups indicated in FIG. 4A. In particular, each data row 414b-414g indicates a named sender in column 412b that is authorized to act as a source node to transmit data to any node that is a member of the group named in column 412a. In the present example, such access rights may be specified specific to a particular transmission protocol (e.g., Remote Desktop Protocol (RDP), Secure Shell protocol (SSH), MySQL protocol, HTTP Secure (HTTPS), etc.), with three example protocols shown, those being HyperText Transfer Protocol (HTTP) 412c, File Transfer Protocol (FTP) 412d, and Simple Mail Transfer Protocol (SMTP) 412e. In addition, senders may be identified in three different manners in the present example, including by IP address, by IP address range, or by group name, although other naming conventions may be employed in other embodiments (e.g., DNS domain names). For example, row 414b indicates that sending nodes that have IP addresses in the range 0.0.0.0/0 (used here to represent all hosts) may initiate communications using the HTTP protocol to nodes that are members of Group1, but that such sending nodes may not initiate communication to nodes that are members of Group1 using either the FTP or SMTP protocol. Row 414c shows that source nodes that are members of Group1 may initiate communications to nodes that are members of Group2 using the HTTP protocol, but not the FTP or SMTP protocol. Row 414d shows that source nodes that are members of Group3 may initiate communication to nodes that are members of Group2 using the HTTP or SMTP protocols, but not the FTP protocol. Row 414e shows that the single source node with the IP address 192.25.1.23 may initiate communication with member nodes of Group2 using any of the three listed protocols. Subsequent rows 414f-414g contain descriptions of additional access policies. Column 412f indicates that additional information may be specified with respect to access policies (e.g., additional protocols, types of operations, types of data formats, policy expiration criteria such as timeouts, contact information for the user that created or otherwise manages the policy, etc.).

In the example shown in FIG. 4B, access policies may be specified on a per-transmission protocol basis. In the present example, when a source is granted access via a particular protocol, such as HTTP, this may be taken to mean that the sender may send Transmission Control Protocol (“TCP”) packets to nodes in the specified group at the default port for HTTP, port 80. Other embodiments may allow access rights to be specified at other levels of detail, such as to not indicate particular protocols, or to further specify particular ports for use with particular protocols. For example, some embodiments may allow access rights to more generally be specified with respect to any transmission properties of particular network transmissions, such as types of packets within particular protocols (e.g., TCP SYN packets, broadcast packets, multicast packets, TCP flags generally, etc.), connection limits (e.g., maximum number of concurrent connections permitted), packet size, packet arrival or departure time, packet time-to-live, packet payload contents (e.g., packets containing particular strings), etc. In addition, other embodiments may specify access policies in various manners. For example, some embodiments may provide for the specification of negative access policies, such as ones that specify that all nodes except for the specified senders have certain access rights. Also, different embodiments may provide varying semantics for default (unlisted) access policies. For example, some embodiments may provide a default policy that no sender may communicate with nodes of a given group unless authorized by a particular other policy, while other embodiments may provide a default policy that senders operated by a given user may by default communicate with any other nodes operated by the same user, or that nodes in a given group may by default communicate with other nodes in the same group. Finally, various embodiments may specify groups and group membership in various ways, such as by providing for hierarchies of groups or by allowing groups to be members of other groups, such that a policy would apply to any node below an indicated point in the hierarchy or to any node that is a member of an indicated group or of any sub-groups of the indicated group.
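To make the structure of tables 400 and 410 concrete, the following Python sketch models the membership and access-rights rows and performs a per-protocol check; the data layout and helper names are hypothetical, chosen only to mirror the three sender notations (IP address, address range, group name) of FIG. 4B.

    import ipaddress

    # Table 400: (node, group) membership pairs.
    memberships = {("A", "Group2"), ("B", "Group2"), ("D", "Group1"), ("D", "Group3")}

    # Table 410 rows: (destination group, sender spec, protocols allowed).
    access_rows = [
        ("Group1", "0.0.0.0/0", {"HTTP"}),
        ("Group2", "Group1", {"HTTP"}),
        ("Group2", "Group3", {"HTTP", "SMTP"}),
        ("Group2", "192.25.1.23", {"HTTP", "FTP", "SMTP"}),
    ]

    def groups_of(node: str) -> set:
        """Table 400 lookup: the groups of which a node is a member."""
        return {group for n, group in memberships if n == node}

    def sender_matches(spec: str, src_ip: str, src_groups: set) -> bool:
        try:  # the spec is an IP address or CIDR range ...
            return ipaddress.ip_address(src_ip) in ipaddress.ip_network(spec)
        except ValueError:  # ... otherwise treat it as a group name
            return spec in src_groups

    def may_send(src_ip: str, src_node: str, dest_group: str, protocol: str) -> bool:
        src_groups = groups_of(src_node)
        return any(group == dest_group and protocol in protocols
                   and sender_matches(spec, src_ip, src_groups)
                   for group, spec, protocols in access_rows)

Under this toy data, may_send("10.0.0.5", "X", "Group1", "HTTP") is True via the all-hosts row 414b, while the same sender using FTP would be refused.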

Referring next to FIG. 5, shown is a flowchart that provides one example of the operation of a portion of the Network Diagnostic System 145 according to various embodiments. It is understood that the flowchart of FIG. 5 provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the Network Diagnostic System 145 as described herein. As an alternative, the flowchart of FIG. 5 may be viewed as depicting an example of steps of a method implemented in the computing device 300 (FIG. 3) according to one or more embodiments.

In box 505, one exemplary detection routine includes the Network Diagnostic System 145 monitoring a target class of computing nodes. For example, in one embodiment, the target class may include an address-space of computing nodes that have not been used to legitimately communicate with or receive communications from other computing nodes. The address-space may have been set aside for monitoring purposes and is not intended for actual use. Accordingly, there is no legitimate reason for another computing node to attempt communications with a node in this “dark” address space. As such, any traffic monitored by the Network Diagnostic System 145 to this class of targets is likely a form of port scanning attack or other suspicious activity, as previously mentioned.

Beyond monitoring unused computing nodes and node addresses, a second implementation involves monitoring computing nodes and node addresses that are not currently allocated to customers but are intended to be used by future customers. For example, in a public data center operated as a business that provides customers access to computing resources, there is churn in customer use of the available node addresses. At any particular time, some node addresses (e.g., IP addresses) are going to be in use and some are not, where those not in current use may have been used in the past by former customers. Accordingly, in this particular implementation, the target class of computing nodes includes computing nodes that are not currently allocated to customers. Unlike the situation where the set of computing nodes has gone entirely unused, any communications or traffic to a computing node within this address-space cannot be assumed to be suspicious, since a received communication may have been the result of a previous relationship with a customer or user who previously used, but no longer uses, the computing node. Therefore, an aggregate of detected activities of a suspected malicious agent may be considered before determining the agent or source to be malicious. Accordingly, if the same traffic or communication is detected from the agent to a particular number (e.g., 12) of unallocated node addresses, then the activity may be determined to exceed a set threshold (e.g., 11), which may be a determinative factor in categorizing the communication as a network attack. In other words, an instance of network communications is correlated by the Network Diagnostic System 145 with other instances and recognized to be a pattern of activity by a malicious agent.
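The unallocated-address correlation described in this paragraph might be realized as in the Python sketch below; the threshold and counting scheme are illustrative stand-ins for whatever values an embodiment would choose.

    from collections import defaultdict

    THRESHOLD = 11  # e.g., flag once more than 11 unallocated addresses are touched

    touched = defaultdict(set)  # source IP -> unallocated destinations it has contacted

    def observe(source_ip: str, dest_ip: str, unallocated: set) -> bool:
        """Record one communication; True once the source's activity exceeds the threshold."""
        if dest_ip in unallocated:
            touched[source_ip].add(dest_ip)
        return len(touched[source_ip]) > THRESHOLD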

A third possible implementation in monitoring for network attacks is to monitor dropped communications and traffic from allocated computing nodes. As discussed above, customers using the computing nodes define and specify groups of one or more source nodes that are authorized to communicate with groups of one or more destination nodes, such as when a source node and destination node both belong to a common group of nodes, as illustrated in FIG. 4. Unauthorized communications to a computing node are dropped or discarded by the data transmission manager associated with the computing node. In this implementation, the target class of computing nodes includes allocated computing nodes whose dropped communications are monitored by the Network Diagnostic System 145.

For example, one customer may have an access policy that disallows any SSH traffic or only allows SSH traffic from a specific IP address (or range). Upon monitoring the customer's dropped communications, one may find that a single IP address 1.2.3.4 tried to connect with the customer using SSH. Further, after monitoring other allocated customers' dropped communications, IP address 1.2.3.4 is noted to have tried to connect with 100 other customers who do not permit SSH communications. Accordingly, the node at address 1.2.3.4 appears to be scanning the SSH ports of computing nodes.

While a small number of received communications from IP address 1.2.3.4 using SSH to nodes that have blocked SSH communications may be legitimate, an aggregate of dropped communications from one node to a particular number (e.g., 100) of allocated nodes that have blocked this type of communication may be a determinative factor in categorizing the communication as a network attack. In other words, a pattern of activity by the malicious agent is monitored and detected. Further, in some embodiments, multiple sources of non-legitimate traffic may be determined by the Network Diagnostic System 145 to be related and therefore may be treated as belonging to one source node. For example, the multiple sources may share a common feature or trait.
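A minimal sketch of this aggregation, in Python with hypothetical names, counts distinct blocked destinations per (source, protocol) pair, so that the SSH-scanning pattern above would be flagged once the hundredth blocking customer drops a connection:

    from collections import defaultdict

    DROPPED_NODE_THRESHOLD = 100  # illustrative value from the example above

    dropped = defaultdict(set)  # (source IP, protocol) -> distinct blocked destinations

    def record_dropped(source_ip: str, protocol: str, dest_node: str) -> bool:
        """Record a dropped communication; True once the pattern looks like a scan."""
        dropped[(source_ip, protocol)].add(dest_node)
        # e.g., 1.2.3.4 probing SSH on 100 nodes that all block SSH
        return len(dropped[(source_ip, protocol)]) >= DROPPED_NODE_THRESHOLD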

In some embodiments, the Network Diagnostic System 145 interprets the information gathered and observed to determine whether network usage is normal or abnormal. Accordingly, the Network Diagnostic System 145 may monitor network behavior to determine what normal usage looks like and use that as a baseline against which to compare suspected behavior. In other embodiments, the Network Diagnostic System 145 may utilize known or predefined patterns of activity and behavior that have been previously identified as suspicious or malicious to compare with monitored activity. Further, in some embodiments, the Network Diagnostic System 145 utilizes a heuristic approach and learns over time what types of traffic patterns are considered normal for a network and then watches for anomalies in the traffic pattern. The methodology used by the Network Diagnostic System accordingly may be adapted over time as network patterns change.
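As one possible (and purely illustrative) form of such a learned baseline, the Python sketch below keeps an exponentially weighted moving average and variance of a traffic metric and flags observations that deviate by more than k standard deviations; the constants are assumptions, not values from the disclosure.

    class Baseline:
        """Adaptive baseline: EWMA mean/variance with a k-sigma anomaly test."""

        def __init__(self, alpha: float = 0.05, k: float = 3.0):
            self.alpha, self.k = alpha, k
            self.mean, self.var = None, 0.0

        def update(self, value: float) -> bool:
            """Fold in a new observation; return True if it looks anomalous."""
            if self.mean is None:
                self.mean = value
                return False
            deviation = value - self.mean
            anomalous = self.var > 0 and deviation * deviation > (self.k ** 2) * self.var
            # Adapt over time as network patterns change.
            self.mean += self.alpha * deviation
            self.var = (1 - self.alpha) * (self.var + self.alpha * deviation * deviation)
            return anomalous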

In box 510, the Network Diagnostic System 145 detects suspicious or non-legitimate communications from an agent or source to/from the target class of computing nodes, as discussed above, and in box 520, records information describing the suspicious or non-legitimate communications from the agent that have been detected. As discussed above, depending upon the target class of nodes being used, an aggregate number of incidents of non-legitimate communications meeting a threshold value may have to be observed before a determination (525) is made to categorize communications from the agent as suspicious or as a possible network attack. The threshold value may be predefined or may be the result of heuristic computations that take into account different network attributes or characteristics (e.g., a type of transmission protocol used in a communication).

In box 530, a notification of the suspicious or non-legitimate communications, along with information on the particulars of those communications, is sent to the appropriate destination (e.g., the DTM System Manager). A notification may further be sent to the appropriate destination (e.g., the DTM System Manager) when the agent is observed or detected to have stopped or discontinued communications to/from the target class, in boxes 540-550.
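Pulling boxes 505-550 together, one hedged reading of the detection routine is sketched below in Python; monitor and notify_manager are hypothetical callables standing in for the monitoring source and the message to the DTM System Manager, and the stop condition is simplified (a real embodiment might use a timeout instead).

    def detection_routine(monitor, threshold, notify_manager):
        incidents = {}   # source -> recorded descriptions (box 520)
        active = set()   # sources currently categorized as attacking
        for event in monitor():                    # box 505: monitor the target class
            if event.suspicious:                   # box 510: non-legitimate communication
                incidents.setdefault(event.source, []).append(event.describe())
                if len(incidents[event.source]) >= threshold and event.source not in active:
                    active.add(event.source)       # box 525: threshold met, categorize
                    notify_manager("attack", event.source)  # box 530: notify
            elif event.source in active:           # boxes 540-550: agent has stopped
                active.discard(event.source)
                notify_manager("stopped", event.source)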

After detection of activities of a malicious agent on a network 170 with regard to a subset of computing nodes on the network, security protections then may be dynamically implemented for the remaining computing nodes on the network before these computing nodes encounter communications from the malicious agent. For example, a malicious agent may be in the process of scanning for vulnerabilities within a plurality of computing nodes, and its activities may be detected by the Network Diagnostic System after it has scanned one or more nodes. Upon recognizing the malicious nature of the communications from the agent, new security measures may be implemented for the plurality of computing nodes, while the agent is still in the process of scanning and the network attack is still in progress, protecting against communications from the agent or against communications of the type being used by the agent (e.g., an attack via the RDP protocol). In one embodiment, the new security measures that were implemented (by the DTM System Manager 150) may be dynamically removed after the threat of the network attack from the agent is gone or dissipated (e.g., the agent has discontinued communications). Therefore, consider an agent or source that is scanning a fleet of 1000 computers. The agent scans computer 1 and computer 2. By the time the agent attempts communications with computer 3, every computer in the fleet has been protected from communications with the agent, while the network attack continues. In one embodiment, the set of computing nodes that is to be protected from a network attack is a superset of the target class of computing nodes. In other embodiments, the set of computing nodes to be protected is not a superset but may overlap with the target class of computing nodes.

Referring next to FIG. 6, shown is a flowchart that provides one example of the operation of a portion of the DTM System Manager System 150 according to various embodiments. It is understood that the flowchart of FIG. 6 provides merely an example of the many different types of functional arrangements that may be employed to implement the operation of the portion of the DTM System Manager System 150 as described herein. As an alternative, the flowchart of FIG. 6 may be viewed as depicting an example of steps of a method implemented in the computing device 200 (FIG. 2) according to one or more embodiments.

In box 605, an exemplary response routine includes the DTM System Manager 150 receiving notification of detected non-legitimate or suspicious communications within a target class of computing nodes. The notification may identify the source of the communications (e.g., IP address), the intended destination of the communications (e.g., IP address), a categorization of the type of the network attack, the form of communication being used (e.g., communications using the RDP protocol), etc. In box 610, the DTM System Manager 150 determines which computing nodes are vulnerable to the particular network attack, and in box 620, the DTM System Manager 150 implements security protections or measures in response to receiving the notification to protect the computing nodes vulnerable to the particular network attack.

In one embodiment, the access policies of customers may be enhanced to protect against the particular network attack. For example, a customer may have specified access rights for a group indicating that all computing nodes may act as a source node and transmit to any member of the group using the RDP protocol. This set of access rights will not protect the group from the network attack described above from a malicious agent using the RDP protocol. Therefore, the DTM System Manager 150 may determine the node of the customer to be vulnerable to an attack and dynamically update the customer's access rights to be protected against the network attack. The rights may be changed to disallow communications from the malicious agent or source, or to prohibit communications under the RDP protocol, as possible examples. The tables previously discussed with regard to FIG. 4 may then be changed to reflect new settings to protect computing node(s) against a detected network attack. For instance, a specific exception to a customer's defined access rights may be implemented to protect the customer from the detected network attack, such as blocking access to a specific port from a specific IP address or address range. After notification is received that the detected network attack has stopped or dissipated, the DTM System Manager 150 changes or restores the security measures back to the settings previously specified by the customer, in boxes 630-640.
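A compact Python sketch of this save-modify-restore cycle (boxes 605-640) appears below; saved_rules, rules_for, set_rules, and with_block are hypothetical hooks into the access-rights store, shown only to make the cycle concrete.

    saved_rules = {}  # node -> original access rules, kept for later restoration

    def on_attack_notification(attack, nodes, rules_for, set_rules):
        for node in nodes:                          # box 610: find vulnerable nodes
            rules = rules_for(node)
            if attack.protocol in rules.allowed_protocols:
                saved_rules[node] = rules
                # box 620: apply a specific exception, e.g., block the attacking
                # source on the attacked protocol/port (with_block is hypothetical).
                set_rules(node, rules.with_block(attack.source, attack.protocol))

    def on_attack_stopped(set_rules):
        for node, original in saved_rules.items():  # boxes 630-640: restore settings
            set_rules(node, original)
        saved_rules.clear()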

In one embodiment, an option is provided to a customer to allow for or authorize dynamic changes to be made to the customer's access rights when a new network attack is discovered to which a node of the customer is vulnerable. FIG. 7A depicts a table 700 that contains option information for multiple nodes regarding whether to accept a form of protection against a network attack. In this particular implementation, authorization may be provided as to whether dynamic changes may be implemented to access rights for a particular node. Each data row 706b-706d describes a relationship between a node denoted in column 702 and a protection option denoted in column 704. If an operator/customer associated with a node chooses to allow for dynamic changes to be made to the access rights associated with the node, then the protection option denoted in column 704 is marked as “YES”. Otherwise, the protection option denoted in column 704 is marked as “NO”. Thus, for example, rows 706b and 706c indicate that nodes A and B are authorized to receive protection in the form of dynamic changes being made to their associated access rights. Row 706d indicates that node C is not authorized to receive such protection. Therefore, if the customer has not provided authorization, then the customer's or user's access rights will not be changed in accordance with the above-described methods. Accordingly, options 704 may be verified before access rights and security measures defined by a customer or user are changed, in one embodiment.

Customers/users may want to base the choice of whether or not to authorize dynamic changes to their access rights on weighing the risks versus the rewards of such changes. For example, the risk of blocking legitimate communications from a prospective client or consumer/customer may outweigh the risk of being subject to a network attack. In some embodiments, the option to authorize dynamic changes to access rights may be provided granularly, such as in regard to certain ports, and therefore may be authorized for some ports and not others.

FIG. 7B depicts a table 710 that specifies access policies on a per-transmission protocol basis, as earlier discussed with regard to FIG. 4B. In table 710, for the different groups indicated in column 712, access rights are specified specific to a particular transmission protocol to indicate whether communications may be received and transmitted on a port associated with that transmission protocol, with three example protocols shown, those being HTTP 714a, FTP 716a, and RDP 718a. Columns 714b, 716b, 718b indicate whether protections are authorized to be implemented to protect the authorized transmission protocols 714a, 716a, 718a in the table, in accordance with embodiments of the present disclosure. Thus, for example, row 720b indicates that nodes having access policies defined by Group1 are authorized to utilize the HTTP protocol and are not authorized to utilize the FTP and RDP protocols. For the authorized HTTP protocol, protections have not been authorized to dynamically change the access rights associated with the HTTP protocol and associated port(s) to protect against a network attack. Since the FTP and RDP protocols are not authorized and the communications on these ports have been blocked/disallowed in accordance with access policies defined by a customer/user, the contents of the fields of columns 716b and 718b do not need to be considered and are not subject to being dynamically changed.

Next, row 720c indicates that nodes having access policies defined by Group2 are authorized to utilize the HTTP and RDP protocols and are not authorized to utilize the FTP protocol. For the authorized HTTP protocol, protections have been authorized to dynamically change the access rights associated with the HTTP protocol and associated port(s) to protect against a network attack. In contrast, for the authorized RDP protocol, protections have not been authorized to dynamically change the access rights associated with the RDP protocol and associated port(s) to protect against a network attack. Since the FTP protocol is not authorized and the communications on its ports have been blocked/disallowed, the content of the field of column 716b does not need to be considered and is not subject to being dynamically changed. In one embodiment, the tables of FIG. 7 may be reviewed to verify whether a customer/user has authorized dynamic changes to be made to associated access rights in response to a network attack.
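One way to read tables 700 and 710 together is as a pre-change verification step; the Python sketch below models both tables as dictionaries (the layout and values are illustrative, matching the Group1/Group2 examples above) and checks them before any dynamic change is applied.

    node_opt_in = {"A": True, "B": True, "C": False}  # table 700, column 704

    # Table 710: group -> {protocol: (protocol authorized, protection authorized)}
    group_policies = {
        "Group1": {"HTTP": (True, False), "FTP": (False, False), "RDP": (False, False)},
        "Group2": {"HTTP": (True, True), "FTP": (False, False), "RDP": (True, False)},
    }

    def may_change(node: str, group: str, protocol: str) -> bool:
        """Verify opt-in before dynamically changing access rights for a protocol."""
        authorized, protectable = group_policies.get(group, {}).get(
            protocol, (False, False))
        # Only an allowed protocol with protection opted in, on a node whose
        # operator accepted dynamic changes, is subject to modification.
        return node_opt_in.get(node, False) and authorized and protectable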

One embodiment restores security measures to an earlier state before changes were made, as earlier discussed with regard to box 640 of FIG. 6. For example, a customer's or user's access rights may be modified to disallow communications on a data transmission port (via a data transmission protocol) where the customer's access rights specify that such communication is allowed. Accordingly, after a network attack is determined to have stopped, the customer's access rights may be changed back to allow for communications on the data transmission port (via the data transmission protocol).

In other embodiments, the access rights may be changed for a defined period of time, and after the defined period of time expires, the access rights are restored. Therefore, in this type of embodiment, stoppage of the network attack or of the suspicious or non-legitimate communications does not need to be detected and relayed to the DTM System Manager 150.
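A minimal sketch of such a time-limited change, assuming a periodic sweep and an illustrative one-hour duration, could look like the following Python; apply_fn and restore_fn are hypothetical callbacks that install and undo the exception.

    import time

    TTL_SECONDS = 3600  # illustrative defined period of time

    exceptions = []  # (expires_at, node, restore_fn)

    def apply_timed_exception(node, apply_fn, restore_fn):
        apply_fn(node)  # install the temporary exception now
        exceptions.append((time.time() + TTL_SECONDS, node, restore_fn))

    def expire_exceptions():
        """Run periodically: restore access rights whose period has elapsed."""
        now = time.time()
        for entry in list(exceptions):
            expires_at, node, restore_fn = entry
            if expires_at <= now:
                restore_fn(node)  # no attack-stopped signal is needed
                exceptions.remove(entry)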

Additional responses to a detected network attack may also be executed in various embodiments. Possible responses include collecting information on the suspicious communications, such as when, where, which, and how many episodes or incidents have been attempted by the source of the communications; changing access rights; spoofing replies to the source; providing reports to an owner of the source and/or to owner(s) of nodes that have been attacked; publishing an identity of the source to community blacklists accessible from the external network 170; and taking a counteraction against the source to attempt to disable the source. Further, responses could include sending notifications to system administrators and logging the communication activity of the source as an audit trail to provide evidence for post-attack analysis. If a source is determined to be a customer of the data center hosting the Network Diagnostic System 145 and DTM System Manager 150 applications, the DTM System Manager may automatically start a mitigation procedure against the source by blocking outgoing traffic and starting internal investigations of fraud and abuse.

Various embodiments may provide mechanisms for customer users and other users to interact with an embodiment of the DTM system 102. For example, some embodiments may provide an interactive console (e.g., a client application program providing an interactive user interface, a Web browser-based interface, etc.) from which users can manage the creation or deletion of groups and the specification of communication access policies or group membership, as well as more general administrative functions related to the operation and management of hosted applications (e.g., the creation or modification of user accounts; the provision of new applications; the initiation, termination, or monitoring of hosted applications; the assignment of applications to groups; the creation of access rights for groups; the authorization of implementing dynamic changes to the access rights; the reservation of time or other system resources; etc.). In addition, some embodiments may provide an API (“application programming interface”) that allows other computing systems and programs to programmatically invoke such functionality. Such APIs may be provided by libraries or class interfaces (e.g., to be invoked by programs written in C, C++, or Java) and/or network service protocols such as via Web services.

In addition, various implementation architectures are possible for embodiments of the DTM system 102. In some embodiments, multiple Network Diagnostic System and DTM System Manager components may act in a distributed manner to each monitor and manage the data transmissions of one or more associated nodes, whether by each operating as an independent autonomous program or by cooperating with other Network Diagnostic System and DTM System Manager components, and may be hosted on the same computing system as the nodes being managed, possibly as virtual machines, or may instead operate on computing systems remote from the nodes that they manage. In still other embodiments, the functionality of a Network Diagnostic System component 145 may be distributed, such as by being incorporated into each of the computing nodes being monitored, or a distinct DTM System Manager component 150 may operate on behalf of each computing node. In one embodiment, a Network Diagnostic System component may be installed in an unallocated computing node within a data center environment and removed once the node is allocated to a customer. In another embodiment, a Network Diagnostic System component may be installed as a part of each computing node, whether allocated or unallocated.

In other embodiments, a single, central DTM System Manager component, Network Diagnostic System component, or other component may manage the received notifications from Network Diagnostic System components and implement security measures in response to these notifications for a large number of computing nodes (e.g., an entire data center). Further, a single, or a small number of, central DTM System Manager components or other components may monitor network traffic at an edge of the data center at an interconnect with the external network 170 or other network choke point.

As previously noted, the described techniques may be employed on behalf of numerous computing nodes to provide various benefits to those computing nodes. In addition, such computing nodes may in at least some embodiments further employ additional techniques on their own behalf to provide other capabilities, such as by each configuring and providing their own firewalls for incoming communications, anti-virus protection and protection against other malware, etc.

When the described techniques are used with a group of computing nodes internal to some defined boundary (e.g., nodes within a data center), such as due to an ability to obtain access to the data transmissions initiated by those computing nodes, the described techniques may also in some embodiments be extended to the edge of the defined boundary. Thus, in addition to monitoring data transmissions between computing nodes within the defined boundary, one or more Network Diagnostic System components that may access and monitor communications passing through the boundary between internal and external computing nodes may similarly provide at least some of the described techniques for those communications. For example, when a data communication is received at the boundary from an external computing node that is intended for an internal computing node, a Network Diagnostic System 145 component associated with the edge may similarly monitor communications at the edge for network attacks and communicate with the DTM System Manager 150 as network attacks are detected. Possible network attacks and traffic patterns that represent hostile actions or misuse include, but are not limited to, denial-of-service attacks, man-in-the-middle attacks, IP spoofing, port scanning, packet sniffing, worms, backscatter, malicious content in data payloads, trojans, viruses, tunneling, brute-force attacks, etc.

In one implementation, traffic intended for unallocated computing hosts may be identified and routed to a specific computing host hosting the Network Diagnostic System 145. With communications to unallocated computing hosts, privacy concerns in reviewing the contents of the communications are minor or non-existent, since the communications are not legitimate. Identification of the traffic may occur at the edge of the network.

For example, the IP addresses of sources and destinations can be monitored, and for destinations that are determined to be unallocated or unused, the communications may be routed to the Network Diagnostic System 145. Further, in some embodiments, the Network Diagnostic System 145 may be located at the edge and track the IP addresses of sources that are repeatedly attempting to communicate with destination hosts using blocked protocols. Such tracking may be implemented by network hardware at the edge, in addition to network software. This type of monitoring would not violate a customer's privacy since the contents of the communications are not being monitored.

In one embodiment, the Network Diagnostic System 145 communicates with the data transmission managers associated with the allocated computing hosts to receive dropped communications or to review logs from the data transmission managers describing the dropped communications.

Those skilled in the art will also realize that although in some embodiments the described techniques are employed in the context of a data center housing multiple intercommunicating nodes, other implementation scenarios are also possible. For example, the described techniques may be employed in the context of an organization-wide intranet operated by a business or other institution (e.g., a university) for the benefit of its employees and/or members. Alternatively, the described techniques could be employed by a network service provider to improve network security, availability, and isolation. In addition, example embodiments may be employed within a data center or other context for a variety of purposes. For example, data center operators or users that sell access to hosted applications to customers may in some embodiments use the described techniques to provide network isolation between their customers' applications and data; software development teams may in some embodiments use the described techniques to provide network isolation between various environments that they use (e.g., development, build, test, deployment, production, etc.); organizations may in some embodiments use the described techniques to isolate the computing resources utilized by one personnel group or department (e.g., human resources) from the computing resources utilized by another personnel group or department (e.g., accounting); or data center operators or users that are deploying a multi-component application (e.g., a multi-tiered business application) may in some embodiments use the described techniques to provide functional decomposition and/or isolation for the various component types (e.g., Web front-ends, database servers, business rules engines, etc.). More generally, the described techniques may be used to partition virtual machines to reflect almost any situation that would conventionally necessitate physical partitioning of distinct computing systems.

Although the Network Diagnostic System 145, the DTM System Manager 155, and the other various systems described herein may be embodied in software or code executed by general purpose hardware as discussed above, as an alternative the same may also be embodied in dedicated hardware or a combination of software/general purpose hardware and dedicated hardware. If embodied in dedicated hardware, each can be implemented as a circuit or state machine that employs any one of or a combination of a number of technologies. These technologies may include, but are not limited to, discrete logic circuits having logic gates for implementing various logic functions upon an application of one or more data signals, application specific integrated circuits having appropriate logic gates, or other components, etc. Such technologies are generally well known by those skilled in the art and, consequently, are not described in detail herein.

The flow charts of FIGS. 5 and 6 show the functionality and operation of an implementation of portions of the Network Diagnostic System 145 and the DTM System Manager 155. If embodied in software, each block may represent a module, segment, or portion of code that comprises program instructions to implement the specified logical function(s). The program instructions may be embodied in the form of source code that comprises human-readable statements written in a programming language or machine code that comprises numerical instructions recognizable by a suitable execution system such as a processor or CPU 335, 435 in a computer system or other system. The machine code may be converted from the source code, etc. If embodied in hardware, each block may represent a circuit or a number of interconnected circuits to implement the specified logical function(s).

Although the flow charts of FIGS. 5 and 6 show a specific order of execution, it is understood that the order of execution may differ from that which is depicted. For example, the order of execution of two or more blocks may be scrambled relative to the order shown. Also, two or more blocks shown in succession in FIGS. 5 and 6 may be executed concurrently or with partial concurrence. Further, in some embodiments, one or more of the blocks shown in FIGS. 5 and 6 may be skipped or omitted. In addition, any number of counters, state variables, warning semaphores, or messages might be added to the logical flow described herein, for purposes of enhanced utility, accounting, performance measurement, or providing troubleshooting aids, etc. It is understood that all such variations are within the scope of the present disclosure.

Also, any logic or application described herein, including the Network Diagnostic System 145 and the DTM System Manager 155, that comprises software or code can be embodied in any non-transitory computer-readable medium for use by or in connection with an instruction execution system such as, for example, a processor or CPU 335, 435 in a computer system or other system. In this sense, the logic may comprise, for example, statements including instructions and declarations that can be fetched from the computer-readable medium and executed by the instruction execution system. In the context of the present disclosure, a “computer-readable medium” can be any medium that can contain, store, or maintain the logic or application described herein for use by or in connection with the instruction execution system. The computer-readable medium can comprise any one of many physical media such as, for example, magnetic, optical, or semiconductor media. More specific examples of a suitable computer-readable medium would include, but are not limited to, magnetic tapes, magnetic floppy diskettes, magnetic hard drives, memory cards, solid-state drives, USB flash drives, or optical discs. Also, the computer-readable medium may be a random access memory (RAM) including, for example, static random access memory (SRAM) and dynamic random access memory (DRAM), or magnetic random access memory (MRAM). In addition, the computer-readable medium may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or other type of memory device.

One should also note that conditional language, such as, among others, “can,” “could,” “might,” or “may,” unless specifically stated otherwise, or otherwise understood within the context as used, is generally intended to convey that certain embodiments include, while other embodiments do not include, certain features, elements and/or steps. Thus, such conditional language is not generally intended to imply that features, elements and/or steps are in any way required for one or more particular embodiments or that one or more particular embodiments necessarily include logic for deciding, with or without user input or prompting, whether these features, elements and/or steps are included or are to be performed in any particular embodiment.

It should be emphasized that the above-described embodiments of the present disclosure are merely possible examples of implementations set forth for a clear understanding of the principles of the disclosure. Many variations and modifications may be made to the above-described embodiment(s) without departing substantially from the spirit and principles of the disclosure. All such modifications and variations are intended to be included herein within the scope of this disclosure and protected by the following claims.

Therefore, the following is claimed:
1. A method comprising: monitoring, by a network diagnostic system node, a data communication dropped by a first transmission manager node servicing a target class of first virtual machine nodes; determining, by the network diagnostic system node, that the dropped data communication is a form of attack on a network to which the first virtual machine nodes are connected; and sending, by the network diagnostic system node, a notification message of the determined attack to a data transmission system manager node thereby causing the data transmission system manager node to generate a list of one or more internet protocol addresses associated with a source of the dropped data communication and send the list of one or more internet protocol addresses to at least one second transmission manager node for second virtual machine nodes that are not part of the target class.
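As a reading aid only, and not as part of the claim, the claimed sequence can be traced in a short Python sketch; every class, method, and threshold below is a hypothetical stand-in rather than an element of the disclosure.

```python
class TransmissionManager:
    """Stand-in for a second transmission manager node outside the target class."""

    def __init__(self):
        self.blocked_sources = []

    def receive_block_list(self, ip_list):
        # Apply the distributed list of offending source addresses.
        self.blocked_sources.extend(ip_list)

class DTMSystemManager:
    """Stand-in for the data transmission system manager node."""

    def __init__(self, other_managers):
        self.other_managers = other_managers

    def handle_attack_notification(self, source_ips):
        # Generate the list of internet protocol addresses and send it to
        # transmission managers serving nodes outside the target class.
        ip_list = sorted(set(source_ips))
        for tm in self.other_managers:
            tm.receive_block_list(ip_list)

class NetworkDiagnosticSystem:
    """Stand-in for the network diagnostic system node."""

    def __init__(self, manager):
        self.manager = manager

    def process_dropped(self, dropped):
        # Hypothetical determination: repeated drops from one source are
        # treated as an attack, triggering the notification message.
        if len(dropped) >= 3:
            self.manager.handle_attack_notification(d["src"] for d in dropped)

tm2 = TransmissionManager()
nds = NetworkDiagnosticSystem(DTMSystemManager([tm2]))
nds.process_dropped([{"src": "198.51.100.4"}] * 3)
print(tm2.blocked_sources)  # ['198.51.100.4']
```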